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^ Introduction 
C3 



The Cosmological Microwave Background (CMB) is of premier importance for the cosmologists to study the birth of our 
universe. Unfortunately, most CMB experiments such as COBE, WMAP or Planck do not provide a direct measure of the 
cosmological signal; CMB is mixed up with galactic foregrounds and point sources. For the sake of scientific exploitation, 
measuring the CMB requires extracting several different astrophysical components (CMB, Sunyaev-Zel'dovich clusters, 
galactic dust) form multi-wavelength observations. Mathematically speaking, the problem of disentangling the CMB 
map from the galactic foregrounds amounts to a component or source separation problem. In the field of CMB studies, 
a very large range of source separation methods have been applied which all differ from each other in the way they 
model the data and the criteria they rely on to separate components. Two main difficulties are i) the instrument's beam 
varies across frequencies and ii) the emission laws of most astrophysical components vary across pixels. This paper 
aims at introducing a very accurate modeling of CMB data, based on sparsity, accounting for beams variability across 
frequencies as well as spatial variations of the components' spectral characteristics. Based on this new sparse modeling of 
the data, a sparsity-based component separation method coined Local-Generalized Morphological Component Analysis 
(L-GMCA) is described. Extensive numerical experiments have been carried out with simulated Planck data. These 
experiments show the high efficiency of the proposed component separation methods to estimate a clean CMB map 
with a very low foreground contamination, which makes L-GMCA of prime interest for CMB studies. 
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Cosmological Microwave Background is generally not ob- 
servable directly; most CMB experiments such as WMAP 
or Planck provide multi-wavelength observations in which 
the CMB is mixed with other astrophysical components. 
Recovering useful cosmological information requires disen- 
tangling in the CMB data the contribution of several as- 
trophysical components namely CMB itself, Galactic emis- 
sions from thermal dust, spinning dust and synchrotron, 
Sunyaev-Zel'dovich (SZ) clusters, point sources, free-free 
emission, CO em ission to only name a few, see |Bouchet] 



fc Gispert] ( [1999 ). 

The recovery of these sources falls in the framework of com- 
ponent separation. The CMB itself dominates over a large 
fraction of the sky at frequencies lower than 100 GHz, but 
for an efficient component separation a range of channels 
needs to be observed. Component separation consists of es- 
timating a set of parameters which describe the components 
of interest as well as how they contribute to the data. For 
example, it could be parameters describing the statistical 
properties such as power spectra, electromagnetic spectra 
to only name two. 

A large range of component separation techniques have 
been proposed; they mainly differ from each other in the 
way they model the data and the assumptions made on 
the components to be separated. However, none of state-of- 
the-art component separation techniques accurately models 
CMB data. Such a modeling is challenging as it involves a 



wide variety of instrumental and physical phenomena such 
as : 



Instrumental noise : It is generally correlated and 
non stationary : each position in the sky is not observed 
the same number of times. The noise variance varies 
across pixels. 

Point sources : point sources are very hard to ac- 
count for in component separation : each point source 
has its own electromagnetic spectrum. Therefore the 
contribution of point sources can't be defined as a 
simple spatial template that scales across frequencies. 

Emissivity variations : It is a well established fact 
that the emissivity of most foregrounds such as dust 
or synchrotron varies across the sky. Again, assuming 
that the electromagnetic spectrum of these components 
is constant across pixels is inaccurate. 

Heterogeneous beams : the observed maps have 
generally different resolutions; furthermore, the beams 
are not necessarily isotropic and spatially invariant. 



Each observed channel x^, at frequency z^^, contains infor- 
mation about several sky emissions (CMB, dust, etc) which 
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can be written at each pixel k as follows : 



i[k] = biic 




(1) 



where i is the channel number, k is the pixel position, Ng 
the number of sky emissions, bi is the instrument beam {i.e. 
point spread function), aij[k] models for the contribution 
of source j in channel i, Sj is the j-th sky component and 
ni[k] is the noise. 

A first - commonly found - approach to get the CMB, called 
template fitting, consists in fitting sky templates tj (for all 
non-CMB sky emissions). But many other strategies have 
been investigated in the last years in order to extract the 
best CMB map from WMAP or Planck data. If we omit 
the effect of the beams and assume that the sky component 
spectral behavior does not vary across the sky (i.e. aij [k] is 
constant for any fixed (i^j)), then the above equation can 
be recast as : 



X = AS + N 



(2) 



Where X are the observed multichannel data, S the 
unknown sources, and A is the unknown or partially 
unknown mixing matrix {aij is the contribution of com- 
ponent Sj to the channel i at frequency Ui). Therefore, 
one needs to find S and A from the observed data X 
only. This is obviously a highly ill-posed problem known, 
in the statistics community, as Blind Source Separation 
(BSS). During the last decades various methods have been 
introduced to tackle BSS problems. The main differences 
between all these component separation methods are the 
statistical assumptions made to differentiate between the 
sources. A famous example is Independent Component 
Analysis which rely on the statistical independence of the 
sources. Although independence is a strong assumption, 
it is in many cases physically acceptable, and provides 
much better solutions than using a simple second order 
decorelation assumption generally obtained with methods 
such as using Principal Component Analysis (PC A). 
Another example is Internal Linear Combination (ILC), 
which is not stricto sensu a component separation but 
rather a component extraction method in the sense 
that it only estimates a single component with assumed 
known spectrum. In statistics, ILC is closely linked to 
the "Best Linear Unbiased Estimator" {a.k.a. BLUE) 
with the assumption that the covariance of the error is 
identical to the covariance of the data. The most basic 
ILC method amounts to applying this estimation formula 
to the data in the p ixel domain. Furt her refinements have 
been introduced in 'Delabr ouille et al.| (|2009|) where the 
ILC estimator of the CMB is computed in localpatches in 
the wavelet space using so-called Needlet filterq^ 
Beyond statistical independence, the Generalized 
Morphologic al Component Analysis method (GMCA) 
dBobin et al.||2QQ7| |Bobin et al.| |2QQ8) makes profit of 
the fact that most foreground emissions can be sparsely 
represented in a well chosen signal representation {e.g. 
wavelets). The estimation of the components and the 
unknown coefficients in the mixing matrix is performed by 



^ Like undecimated wavelets on the sphere ( Starck et al. 2006 ) , 
the needlet signal representation is a tight frame computed via 
filter banks in spherical harmonics. 



enforcing the sparsity of the components in the wavelet 
domain. 

In addition, parametric meth ods such as MEM [Hobson 
et al. (1999) and Commander [Eriksen et al. (2008) have 
been considered, based on a full physical modeling of the 
sky components. 

Except for GMCA[^ codes relative to these different 
methods are not public. This makes the comparison be- 
tween them relatively difficult. Ho wever, a first comparison 
of these methods has been done in Leach ( 2008 ) on full-sky 
simulated Planck data. Whatever the kind of approach 
one adopts, efficient CMB estimation requires an accurate 
modeling of the data as well as a robust and effective 
separation technique. Also, with the future Planck release 
in 2013, we need to have a much better understanding of 
what methods work better to recover a high quality CMB 
map from full-sky surveys. 



Content of this paper : In this paper, we first review 
in Section [T] the major classes of component separation 
methods, and we discuss the advantages and drawbacks of 
each of them. Section [2] discusses the use of sparsity prior 
for CMB estimation. Then we show in section [4] how the 
GMCA method can be modified in order to properly take 
into account the different resolutions of the different chan- 
nels, and the spatial variation of the mixing matrix. Based 
on this sophisticated data modeling, a novel sparse compo- 
nent separation method Local GMCA (L-GMCA) coined 
is described. Finally, results of extensive numerical exper- 
iments are presented in Section [5j which show the advan- 
tage on the sparse recovery method over the ILC-based ap- 
proach. 



1. CMB Recovery 

Components separation techniques can be grouped in a 
few classes according to the criterion used for separating 
the sources. In this section we present the various meth- 
ods proposed for CMB recovery and discuss their relative 
advantages and drawbacks. 



1.1. Template Fitting 

Having the foregrounds as additional components of the 
microwave sky, one can perform a fit of the template to 
the data for foreground analysis. For Nt template vectors, 
... 7Vt7 template-corrected data has the form at 
frequency Ui 



Nt 



PA, 



(3) 



where the best-fit amplitude, for each foreground tem- 
plate can be obtained by minimizing xJC~^Xi^ where C is 
the total covariance matrix for the template-corrected data 
C = (xixj). Template- fitting can be performed either in 
the pixel domain or in the harmonic space. Pixel-based 
implementation allows for incomplete sky coverage as 
well as a refined modeling of non-stationary noise at the 



^ GMCA is part of the ISAP toolbox 
http:/ / jstarck.free.fr/isap. html 
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expense of a more complex modeling of the data : the 
CMB covariance matrix is large and dense. Conversely, 
spherical harmonics make the modeling of CMB much 
simpler but requires crude stationary assumptions for noise. 

CMB cleaning by template fitting has a number of 
advantages, the first to be its simplicity. The technique 
makes full use of the spatial information in the template 
map, which is important for the non-stationary, highly non- 
Gaussian emission distribution, typical of Galactic fore- 
grounds. xThere are also disadvantages to this technique : 
the noise of the template is added to the solution, and 
imperfect models of the templates could add systematics 
and non-Gaussianities to the data. Template fitting also 
assumes that the electromagnetic spectrum of the emission 
related to a given template is not varying spatially, which 
is generally not true. 



Refer to Dunkley et al. ( |2QQ9 l) for a more detailed de- 
scription of template fitting techniques. 
Template cleaning of the COBE/FIRAS data reduced a 
complicated foregr ound by a fa ctor of 10 by using only 3 
spatial templates ( [Fixsen et al.||1998| ). 



1.2. Second order statistical methods 
ILC: Internal Linear Combination 

In the framework of ILC, very little is assumed about the 
different components to be separated out. The main compo- 
nent is assumed to be the same in all the frequency bands 
and the observations are calibrated with respect to this 
component. Each observation Xi is modeled at pixel k as 
follows : 

Xi[k]=s[k]+Mk]+ni[k], (4) 

where i denotes the frequency channels at frequency 
fi[k] and ni[k] are the foregrounds and noise contributions 
at pixel k respectively. One then looks for the solution : 



s[k] = ^Wi[k]xi[k] 



(5) 



The simplest version of ILC assumes the weights Wi [k] are 
constant across the sky and therefore are not function of 
k. The ILC estimate of the CMB is then obtained by esti- 
mating the weights Wi which minimize the least square of 
the residual Xi — s for all i. Assuming that the covariance 
matrix of the error the is covariance matrix of the data 
themselves (this assumption is a good approximation when 
the foregrounds and/or intrumental noise are preeminent) : 



s = mm 



a s\ 



(6) 



where the norm || . ||f,5:x stands for the weighted Frobenius 
norm : 

||Y||i.,5:x = Trace (Y^5]x"^Y) and a is the electro- 
magnetic spectrum of CMB (it is a vector of ones for data in 
thermodynamic units). Solving the minimization problem 
in Equation [6] leads to : 



s = 



Note that the ILC yields the minimization the total vari- 
ance of the ILC map. Away from the galactic plane and on 
small scales, the best linear combination for cleaning the 



CMB map from foregrounds and noise might be different 
from regions close to the galactic plane or the large scales. 
To improve on this, the map is generally decomposed into 
several regions and ILC is applied to them independently. 
The ILC performs well when no prior information is known 
about the different components : the only prior knowledge 
is the CMB electromagnetic spectrum. Therefore, ILC is 
considered as a foreground removal technique rather than 
a component separation technique. 

Correlated Connponent Analysis (CCA) 



This method ( Bedini et al.||2QQ5| ) is a semi-blind approach 
that estimates the mixing matrix on sub-patches of the sky 
based on second order statistics. It makes no assumptions 
about the independence of the sources. This method adopts 
the commonly used models for the sources to reduce the 
number of parameters estimated and exploits the spatial 
structure of the source maps. The spatial structure of the 
maps are accounted for through the covariance matrices at 
different shifts (r, ijj) 

Cd{r, ^) = AC sir, ^)A' + C,(r, ^) , (7) 

where Cd{r,ilj) is estimated from data and the noise co- 
variance matrix Cn(r, -0) is derived from the map-making 
noise estimations. Then by minimizing the equation 



J2 \\AC,{t,^)A' - [Cd{T,^) - C„(t,^)]| 



(8) 



T,1p 



where the Frobenius norm is used, one can estimate the 
mixing matrix and the free parameters of the source covari- 
ance matrix. Given as estimate of Cs and C^, the above 
equation can be inverted and component maps obtained 
via the standard inversion techniques of Wiener filtering or 
generalized least square inversion. To obtain a continuous 
distribution of the free parameters of the mixing matrix, 
CCA is applied to a large number of partially overlapping 
patches. 

Spectral Matching ICA (SMICA) 



SMICA dPelabrouille et al.||2QQ3D is a ICA-based compo- 
nents separation technique relying on second order statis- 
tics in the frequency or spherical harmonics domain. For 
multichannel maps Xi[k] one computes 



2i 



(9) 



for each i and m. One then models the ensemble- average 
diS Ri = (^Ri^ = R^ where the sum is over the com- 
ponents. For each component, R^^ is a function of a param- 
eter vector 0^ , where the parameterization embodies the 
prior knowledge about the components as well as the mix- 
ing matrix. The parameters are determined by minimizing 
the spectral mismatch 



(10) 



where K{Ci\C2) is a measure of mismatch between Ci 
and C2 (typically the Kullback-Leibler divergence between 
two Gaussian distributions with same mean and covariance 
matrices R£ and R\ ) . 
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1.3. Parametric methods 
Maximum Entropy Method (MEM) 

Having a hypothesis H in which the observed data 
{xi}i=i^... are a function of the underlying signal s one 
can follow Bayes' theorem, which tells us the posterior prob- 
ability is the product of the likelihood and the prior prob- 
ability divided by the evidence; 



P{s\d,H) 



P{d\s,H)P{s\H) 



(11) 



Then following the maximum entropy principle, one uses 
a prior distribution which maximiz es the entropy of th e 



estimate given a set of constraints Hobson et al. ( 1999| 



The MEM can be implemented in the spherical harmonic 
domain where the separation is performed mode-by-mode 
which speeds up the optimization. Based on this maximum 
entropy approach, FastMEM is a non-blind method, which 
means that the spectral behavior of the components must 
be known beforehand. Further details of this method are 



presented in Stolyarov et al. (2005). 



foregrounds (free-free, synchrotron and dust emissions) 
but with the assumption that these electromagnetic 
spectrum of these components is fixed across th e sky . 
To our knowledge, only ILC [Delabrouille et alTl ( |2009 ) 
in the needlet domain has been extended to perform on 
patches to allow for space varying ILC weights. 

Separation criteria : When no physics-based as- 
sumptions are made on the compo nents, blind source 
separation t echniques such as CCA (Bedini et al.||2005 ) 
or SMICA ( [Delabrouille et al.||2003 [) rely on statistical 
separation criteria to differentiate between the com- 
ponents. As described in the previous section, CCA 
and SMICA both make use of second-order statistics 
to separation components, either covariance matrices 
in the pixel domain for CCA and ILC or in spherical 
harmonics for SMICA. If second-order statistics provide 
sufficient statistics for Gaussian random fields like the 
CMB, it is no more optimal for non-stationary and 
non-Gaussian components such as foregrounds. In 
this, higher-order statistics should also play a preem- 
inent role to measure discrepancies between the sources. 



1.4. Comparisons of assumptions and modeling 

These different methods give a rather large overview of the 
different approaches used so far to estimate the CMB from 
mult i- wavelength observations. They all differ in the way 
they model the data and the assumptions made to disen- 
tangle the different components : 

— Instrumental noise : only pixel-based or wavelet- 
based methods are able to, at least approximately, 
model for the variation across pixels of the noise vari- 
ance. Methods based on spherical harmonics rely only 
on the power spectra or on cross spectra to perform 
the separation; instrumental noise can be accounted 
for via its power spectrum which do not give a precise 
characterization of its non-stationarity. 

— Frequency beams : the beam of the observations 
varies across frequencies. The effect of the beam being 
a simple mult ipole- wise product in spherical harmonics, 
methods performing in this domain can effectively 
model for its effect. Pixel-based methods generally 
neglect the variation of the beam across frequency or 
at least estimate the mixture parameters {i.e. mixing 
matrix) at a common low resolution at the expense of 
the lost of small scale information. 

Component's mo deling : P arametric methods such 
as F astMEM Hobson et al.| (p.999) or Commander 



Eriksen et al.f^2008 ) are attempts to make use of 
accurate modeling of astrophysical components in- 
volving spatially variant parameters {e.g. spectral 
index and temperature of dust). The dimension of 
the parameter space growing with the resolution, 
parameters are generally estimated at low resolution 
and extrapolated to higher resolution. If this allows 
for a precise modeling of foregrounds at large scale, 
this model is still inaccurate to capture small scale 
variation s of the component s. BSS-based methods such 
as CCA ( Bedini et al.|2005 ) have also been extended so 
as to incorporate a parametric modeling of the major 



As emphasized in the introduction, designing a compo- 
nent separation methods allowing for an accurate mod- 
eling of the data {i.e. space- varying noise variance, het- 
erogenous beams) as well as a precise modeling of the data 
{i.e. accounting for the space- variant spectral characteristic 
of components, effective separation criterion for both non- 
Gaussian foregrounds and CMB) is a challenging task. Up 
to know, none of the proposed methods takes into consid- 
eration all the aspects of data and component modeling. 

1.5. Towards wavelets and sparsity 

The discussion of the previous section sheds light on the 
respective advantages of data modeling in the pixel space 
and spherical harmonics. However, it appears clearly that 
none of these two different approaches appropriately deals 
with the separation of non-stationary and/or non-Gaussian 
signals as well as the correlation between pixels of the com- 
ponents 

Taking the best of both approaches is generally made by 
switching to a wavelet-based modeling of the data : i) the 
wavelet decomposition of the data leads to a splitting of the 
spherical harmonics domain which allows for a localization 
in frequency or scale together with a localization in space. 
Hence, harmonic methods such as SMICA as well as pixel- 
based techniques such as ILC have been extended with suc- 
cess in the wavelet domain , using isotropic w avelets on the 



sphere ([Starck et al. 



to WSMICA (Moudd 



2006[ iMarinucci et al.j |2Q08), leadin 
en et al.||2005|), N-ILC Ibelabrouill^ 



et al.,200 9) and GenlLC ( [Remazeilles et al.|201ip methods. 

Owing to the spatial localization of the wavelet repre- 
sentation, CMB estimation is carried out on different re- 
gions of the sky in the different wavelet scales which allows 
for a more effective cleaning of non-stationary components. 
Similarly, a template fitting technique such as SEVEM 
is now app lied in the wavelet domain (Fernandez-Cobos 
et al.| |2012l ) to capture scale-space variabilities of the tem- 
plates' emissivity. However, this method makes use of Haar 
wavelets on Healpix faces which is certainly sub-optimal 
since this specific wavelet function is irregular and exhibits 
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very poor mathematical properties. If the data modehng is 
more complex than a simple template fitting, this approach 
does not have the versatility of N-ILC and cannot capture 
variation of the spectral emission of a given component in- 
side a given scale. 

It is important to note that the wavelet transform has 
the ability to capture coherence or correlation across pix- 
els while averaging the noise contribution. This essential 
property is also known in the field of modern statistics as 
sparsity : correlated structures in the pixel domain are con- 
centrated in a few wavelet coefficients. As an illustration, 
Figure [T] displays the histogram of the pixel intensity of sim- 
ulated dust emission at 857GHz in the pixel domain in the 
top panel and in the wavelet domain in the bottom panel. 
This figure particularly shows that if all the pixels of dust 
are nonzero, the vast majority of its wavelet coefficients are 
very close to zero and only a few have a significant ampli- 
tude. This enlightens the ability of the wavelet transform 
to concentrate the geometrical content {i.e. correlation be- 
tween pixels) of dust emission in a few coefficients. 
Extensions to the wavelet domain of the above CMB es- 
timation techniques benefit from the space/frequency lo- 
calization of the wavelet analysis. However they do not 
make profit of the sparsity, and thus highly non- Gaussian, 
property of the wavelet decomposition of the compo- 
nents. Conversely, Generalized Morphological Component 
Analysis (CMC A) further enforces sparsity to better esti- 
mate the sought after sources in the wavelet domain. This 
component separation method is described in the sequel. 





Wavelet coefficient level 



Fig. 1. Histogram of simulated dust emission at 867GHz in 
pixel domain (top), wavelet domain at the (bottom). 



2. Sparse Component Separation: Generalized 
Morphological Component Analysis (GMCA) 

2.1. Sparsity for component separation 

Sparse priors have been shown to be ve ry useful in reg - 
ularizing ill-posed inverse problems (Starck et al. 2010). 
In addition, sparse priors using wavelet bases have been 
used with success to various signal processing problems 
in astrono my including denoising, d econvolution and 
inpainting (Starck & Murtagh 2006). Like ICA-based 
techniques, GMCA aims at solving a blind or semi-blind 
source separation pro blem. Howeve r, GMCA performs in 
the wavelet domain ( Bobin et al.| [2Q08) to benefit from 
the sparsity property of the foregrounds in this domain. 
It is worth mentioning that sparsity has also been used 
for compone nt separation in fields of research outside 
astrophysic s ( Zibulevsky fc Pear lmutter]| 2001 Bobin et al. 
20081 120091). 



In the sequel, we denote by ^ the forward wavelet 
transform and its backward transform. Mathematically 
speaking, this is a matrix made of wavelet waveforms. One 
can uniquely decompose each source Sj in the wavelet do- 
main as follows : 

aj are the expansion coefficients of source sj in the wavelet 
basis. The sparsity of the sources means that most of the 
entries of aj are equal or very close to zero and only a few 
has significant amplitudes. The multichannel data X can 
be written as 

X = Aa^ + AT, (12) 

The objective of GMCA is to seek an unmixing scheme, 
through the estimation of A, which yields the sparsest 
sources S in the wavelet domain. This is expressed by the 
following optimization problem written in the augmented 
Lagrangian form : 



\X-Aanl + ^J2\\^j\\p ' 



(13) 



where typically p = (or its relaxed convex version with 

1/2 

p = 1) and ||X||p = ^trace(X^X)^ is the Frobenius 

norm. For more tech nical details about G MCA, we refer the 
interested reader to|Bobin et al. (2008), where it is shown 
that sparsity, as used in GMUA, allows for a more precise 
estimation of the mixing matrix A and more robustness to 
noise than ICA-based techniques. 

Prior astrophysical knowledge can also be easily intro- 
duced in GMCA : this includes the electromagnetic spectra 
of components such as CMB and SZ. BSS methods such 
as SMICA or CCA are intuitively well understood as they 
rely on second-order statistics astrophysicists customarily 
use in their everyday life : correlations, covariance matri- 
ces, powerand cross-spectra. More details can be found 



in [Bobin et al. (|2008|). Sparsity is not only sensitive to sec- 
but also to the high er-order statistics of 



ond order statistics 



the components. In 
on toy model simu 



(2008), it has been shown 



Bobin et al. 

at ions (all frequency channels at the 
same resolution and no spatial variation of the mixing ma- 
trix) that the sparsity criterion was more efficient than the 
SMICA criterion to separate the different components. 
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2.2. Limitations of GIVICA 

According to the mixture model underlying GMCA, all 
the observations are assumed to have the same resolution. 
However, in most CMB experiments, this assumption does 
not hold true : the WMAP {resp. Planck) Full Width at 
Half Maximum (FWHM) varies by a factor of about 5 {resp. 
7) between the best resolution and the worse resolution. 
Assuming that the beam is invariant across the sky, the 
linear mixture model should be substituted with the fol- 
lowing : 



where Be = The final estimate is 

computed by inverting the above filter. 



Vz = !,••• ,M; 



where bi stands for the impulse response function of the 
PSF at channel i. This model can be expressed more glob- 
ally by introducing a global multichannel convolution op- 
erator B defined so that its restriction to the channel i is 
equal to a convolution with bi : 

X = S(AS) + N (14) 

To our knowledge, only blind source separation methods 
performed in spherical harmonics carefully account for the 
full mixture model in Equation [14] where such a model can 
naturally be simplified {i.e. the convolution operator is 
diagonalized by the spherical harmonics as in Fourier space 
for data sampled on regular Euclidean grids) . However this 
holds true as long as the beam is assumed to be isotropic 
and invariant across the sky. Extending the data modeling 
to anisotropic and/or space varying beams can only be 
made in the pixel domain. 



The standard version of GMCA does not model for the 
different resolutions of the data. For the sake of simplicity, 
the effect of the beam was neglected during the source sep- 
aration process. The estimation of the mixing matrix with 
GMCA was performed directly on the data assuming that 
the linear mixture model is valid. In this setting, the CMB 
map is evaluated by applying the Moore pseudo-inverse of 
the estimation mixing matrix to the raw data : 

s = [A%X 

where = {A^ A)~^ and the notation [Y]i stands 
for the i-th row of the matrix Y. Note that, by conven- 
tion, matrix elements related to the CMB map have index 
1. Neglecting the beam effect implies that the CMB map 
estimated by GMCA is biased. Hopefully this bias can be 
computed explicitly by observing that, for each (£, m) in 
the spherical harmonics domain : 



M 



where r is a residual term that models for noise and fore- 
grounds contaminating the estimated CMB. The variable 
^ stands for the true bias-free CMB in spherical harmon- 
ics. By neglecting the residual term, the beam-induced bias 
can be regarded as a filter in spherical harmonics Bi : 



M 



BfS^f , 



(15) 
(16) 



Neglecting the beam effect in the component separation 
process has however two major drawbacks : 

— Discrepancy with the hnear mixture model : 

the linear mixture model does not hold when the 
data do not share the same resolution. This leads 
very likely to a non-optimal parameter estimation 
process. According to the linear mixture model, the 
contribution of the CMB in the data is the same across 
frequencies for each (£, m) : this contribution is given 
by the electromagnetic spectrum of the CMB. This 
is no more true when the resolution varies from one 
observation to another : this contribution now varies 
across £. This means that carrying out component 
separation from the raw data without taking care of 
the beam effect should lead to an inaccurate unmixing 
procedure. The dominant energetic content of the 
components is mainly concentrated at low frequencies 
where the beams do not differ too much from each 
other. At these frequencies, the linear mixture model 
is a good first order approximation which may explain 
the seemingly good performances of GMCA in [Leach 
( |2QQ8[ ). However, performances at smaller scales should 
be enhanced by correctly modeling the beam in the 
separation process. 



Noise: Following the previous argument, the com- 
putation of the mixing matrix in GMCA is mainly 
driven by the low frequency content of the components. 
However the signal-to-noise ratio of the observations 
highly depends on their resolution; low resolution 
observations typically have a low SNR at high spatial 
frequencies. This entails that estimating the mixing 
parameters from the low frequency content for the data 
do not carefully account for the noise contamination at 
smaller scales; this is likely to lead to low SNR CMB 
estimates at high i. 



Like most component separation methods used so far, 
GMCA explicitly assumes that the mixing matrix does not 
vary across pixels. This is a strong limitation as it is clearly 
not suited to capture the expected emissivity variation of 
galactic foregrounds across the sky. 

We show in the next two sections how GMCA can be mod- 
ified to solve these problems. 

3. GMCA and frequency map resolutions 

3.1. Component separation from heterogenous data 

Accounting for the heterogeneity of the data in the 
separation process can be carried out in two different 
ways : the most straightforward technique would con- 
sist in adapting GMCA by substituting the data fidelity 
term in Equation 13 with the more rigorous expression : 



|X-S(AS) 



where B represents the beam effect. If 



this approach has been e xplored with som e success in a 
different imaging context ( Bobin et al.|2QQ9 ), its high com- 
putational cost makes it hard to apply to large-scale CMB 
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data. 

The second approach would rather consist in finding ways 
to apply GMCA to data that share the same resolution. 
This could be obtained by first converting all the obser- 
vations to the common desired resolution {e.g. 5 arcmin). 
However, the beams at low frequency vanish much faster 
than at higher frequency; this entails that a brute-force 
deconvolution of the low frequency channels will yield a 
large amplification of the noise. In practice, such a decon- 
volution is also prohibited by numerical issues. It is impor- 
tant to underline that low resolution channels will mainly 
add noise to the final estimate at high spatial frequencies 
(or equivalently high i multipoles in the spherical harmon- 
ics domain). Therefore low resolution observations will be 
most informative at low frequencies and should be avoided 
for the reconstruction of the high frequency content of the 
CMB map. In the next paragraph, we introduce an elegant 
strategy to extend GMCA so as to cope with data involving 
frequency-dependent beams. 

3.2. Multiscale GMCA 

A solution to this problem is to adapt the wavelet decom- 
position for each channel such that the wavelet coefficients 



of the M available channels at scale ji do have 
exactly the same resolution. This can be easily obtained 
by choosing a specific wavelet function for each channel i 
(i = ,M) such that: 



where bj is the beam of the i-th channel, h 



•) ^target 



(17) 



the 



Gaussian beam related to the targeted resolution, and il)^ 
is the standard wavelet function at scale ji. This approach 
is closely related to t he wavelet-bas e d deconvolution tech- 
niques introduced in Donoho (1995); Khalifa et al. (2003) 
and called, in the statistical literature, wavelet- vaguelette 
decomposition. According to this formalism, the wavelet- 
vaguelette decomposition of a function / is defined by 



btarget ^ / = E E (^^ ^ /' ^k^)^i'^ ^ 
/i=l 

where is the number of wavelet scales and Np is the 
number of pixels in the map. 



In this paper, we propose extending these ideas to the 
case of component separation. Replacing the data matrix 
X by the matrix of wavelet coefficients W (W^^* = Wi = 

^Xi,^^^-^^), and applying the pseudo inverse of A to W, 

we get Z = A^W. The quantity Zj^^ is now related to the 
wavelet coefficients of the j-th source in the /i-th wavelet 
scale at location k. The estimated sources can then be es- 
timated via the following reconstruction formula : 



/i=l k=l 



(19) 



where the notation sj denotes the estimate of the j-th 
source. The reconstruction formula cannot be implemented 
in practice, because the wavelet coefficients of the i-th 



channel w^^^ at scale /i cannot be calculated when the 



fraction 



is undefined or numerically un- 
stable. It means that at each wavelet scale /i, only the 
channels with non- vanishing beams can be used. At the 
largest wavelet scale, all channels are active, while at the 
finest wavelet scale, only few channels are active. 

We note the number of channels available for a 
given resolution level /i, and N^^^ the number of wavelet 
scales that can be used for a given source j. This implies 
introducing a mixing matrix A^^^ per wavelet scale; this 
mixing matrix will be evaluated from the channels 
available at scale fi. The size of these matrices and the 
number of sources (which is limited to the number of 
channels) are varying with ji. For each wavelet resolution 

level /i, we now have a solution : 



where Z^^) = A^^^^W^^) and w[^^ 



(20) 



XiMtk)^ with 



i = 1, • • • , C^, and /i = 1, • • • ,N^. 

The final solution Sj for the j-th source is obtained by 
a simple wavelet reconstruction : 



N 



Np 

EE 

/i=l k=l 



(21) 



Multiscale GMCA (mCMCA) is similar to a harmonic 
space method, where we consider one mixing matrix per 
wavelet band (or frequency band), but on the contrary of 
SMICA, the mixing matrix is calculated from high order 
statistics of wavelet coefficients. Like SMICA, mCMCA can 
properly take into account the resolution of the different 
channels but with the paramount advantage that it does 
not make any assumption on the stationarity of the sources. 

3.3. Practical Implementation 

Though the wavelet-vaguelette source separation method 
seems a complicated procedure, it can be largely simplified. 
Indeed, thanks to the linearity of both the beam convolu- 
tion operator and the wavelet operator, the matrix W*^^^ 
relative to the active channels at resolution level /i can be 
computed by putting the active maps at the same resolution 
(which depends on /i) and then compute the same wavelet 
decomposition on each of them. Once the matrix W^^^ is 
obtained, one can run the GMCA algorithm to get the mix- 
ing matrices A^^^. This can be repeated for each resolution 
level /i. This way, a CMB map is evaluated for each res- 
olution level, with a number of active channels decreasing 
with scales. The final solution is then derived by aggregat- 
ing these solutions in the wavelet space as in Equation [21] 
Another advantage of this approach is its ability to benefit 
from the structure of the Healpix format : different values 
for the parameter nside can be chosen depending on the 
resolution level, which speeds up the computation time. 
As an illustration, we give a possible parameterization of 
mCMCA for Planck data. In this case, we have nine chan- 
nels from 30 to 857 GHz, with a resolution which goes 
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Observations used 


Scale 




30 to 857 GHz 


33 arcmin 


9 


44 to 857 GHz 


24 arcmin 


8 


70 to 857 GHz 


14 arcmin 


7 


100 to 857 GHz 


10 arcmin 


6 


143 to 857 GHz 


5 arcmin 


5 



Table 1. Example of resolution levels to use in mOMCA, 
with the number of active channels per resolution level. 



roughly from 33 arcmin to 5 arcmin. We therefore have 
considered five resolution levels, with a number of active 
channels varying from 9 to 5 (see Table [T]). 

However, the underlying modeling of mOMCA does not 
allow for a precise separation of components with spectral 
variations. Next section will show that the spatial varia- 
tion of the matrix can also be taken into account using a 
partitioning of the wavelet scales. 

4. GMCA and spatially variant mixing matrix 

4.1. A refined modeling to get closer to astrophysics 

As emphasized in the introduction, the complexity of CMB 
data makes it very hard to fully model for all the physical 
phenomena with a simple linear mixture model. The linear 
mixture model used so far in most component separation 
methods assumes that : i) the number of components is 
limited to the total number of observations, ii) explicitly 
assumes that the emissivity of the component is space in- 
variant {i.e. the mixing matrix does not vary from one pixel 
to another). Unfortunately, these assumptions are not ver- 
ified by common CMB data. The components one observes 
between 30GHz and 857GHz include the CMB, Sunyaev 
Zel'Dovich effect, free-free, synchrotron, CIB, anomalous 
dust and dust emissions as well as IR and radio sources 
the number of which largely exceeds the number of obser- 
vations provided by Planck. 

From the current knowledge in astrophysics, some of these 
components can be approximated quite accurately with 
space-invariant emissivity; this is the case for the CMB, 
SZ effect, free-free and synchrotron emission. 
The dust emissions provide a very important, if not domi- 
nant, contribution in high frequency channels. These chan- 
nels have the best spatial resolution {typ. 5 to 10 arcmin); 
modeling for such a component should thus be of utmost 
importance for a high-resolution estimate of the CMB map. 
However modeling dust emission is a strenuous problem. 
Indeed, contrary to well-characterized foreground emissions 
such as free-free or synchrotron, most component separa- 
tion models do not provide a satisfactory modeling of this 
contribution. As an example, the gray body dust model, 
known as being one of the most accurate dust model, in- 
volves two parameters : a spectral index and temperature 
varying across pixels. 

These remarks imply that the linear mixture model does 
not allow for enough degrees of freedom to fully capture the 
complexity of CMB data in the frequency range observed by 
most CMB surveys like Planck. In the following, we will fo- 
cus on extending this model to account for spatially- variant 
spectra. 



4.2. Multiscale local mixture model 

The global mixture model used by most component sepa- 
ration methods do not allow for enough degrees of freedom 
to capture naturally scale/space-dependent astrophysical 
phenomena. Scale-dependancy of the analysis naturally 
arises from the mGMCA formalism, localization requires 
decomposing each wavelet scale into patches. It is worth 
mentioning that localizing the estimation of the CMB has 
also been pr oposed within the ILC framework ( Delabrouille| 
et al.||2009D to analyze WMAP map data. N-ILC consists 
in : ij decomposing each wavelet scale into patches with 
a scale-dependent size, ii) performing ILC on each patch 
independently. However, the spectral behavior of many 
components is not expected to change dramatically from 
one patch to its neighbor; such an independent processing 
each on patch may not be optimal. 

For this purpose, we extend the mixture model to a 
multiscale local mixture model. In such modeling, each 
location of the sky in each wavelet scale is analyzed several 
times with different patch sizes which allows to locally se- 
lect the best parameters. 

Before going any further, let us first recall some useful no- 
tations : if X denotes the data, we will denote by W^^^ 
the matrix composed of the /i-th wavelet scale of the data 
X. In what follows, the indexing W^^^[/c] will denote the 
square patch of size p centered about pixel k. Following the 
multiscale local model, a patch-based representation of the 
data at scale /i and location k is modeled as follows : 



w(^) [k] = a(^) [k]S^^^ [k] + n(^) [k] 



(22) 



A direct extension of mGMCA to solve this problem 
would simply amount to applying GMCA independently 
to each patch at location k and each scale /i. However, this 
approach would suffer from certain drawbacks assuming 
that some "optimal" patch size at each scale /i is known 
and fixed in advance. However, fixing a priori the patch 
size is a very strong constraint : the appropriate patch size 
should be space-dependent as well and may vary from one 
region to another. 

This suggests that a trade-off should be made between 
small/large patches which would balance between statis- 
tical consistency (large patches) and adaptivity (small 
patches). This indicates that the choice of the patch-size 
should be adaptive and dependent on the local content of 
the data. Inspired by best basis techniques in multiscale 
signal analysis, an elegant way to alleviate this pitfall is 
to perform GMCA at each wavelet scale fi with various 
patch sizes in a quadtree decomposition. In a nutshell, 
GMCA is first performed on the full field to obtain a 
first estimator of the mixing matrix denoted by A'j[k] in 
Figure [2] The field is further decomposed into 4 identical 
non-overlapping patches on which GMCA is applied to 
provide a set of mixing matrices denoted A^[k] in the 
figure. The process is iterated until the patch is equal to 
the desired patch size p^. The number of analysis levels is 
equal to L^. Interestingly, the same area of the sky being 
analyzed at different mixture scales {i.e. patch sizes), it 
makes it possible to choose afterwards the "optimal" patch 
size from the different estimates obtained for / = 0, • • • ,1/^. 
It therefore allows for more degrees of freedom to select 
an adapted patch size at each location. An exhaustive 
description of the parameter selection strategy is given in 
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Fig. 2. Multichannel quadtree decomposition. This figure 
displays the tree-based decomposition of the multichannel 
data W(^). A sequence of mixing matrices are estimated on 
patches with dyadically decreasing size. Large size patches 
are used as a warm startup for the estimation process on 
smaller size patches. 



Appendix [Aj 



4.3. The L-GMCA algorithm 

The Local GMCA (L-GMCA) algorithm can be described 
as follows : 



back and forth spherical harmonic transform (of asymp- 
totic complexity Npix^^'^) followed by two wavelet trans- 
form for each HEALPix face (of asymptotic complexity 
Npix\og{Npix))^ and a weighting of each coefficient map 
at each scale and each resolution level (of asymptotic com- 
plexity Npix). This should be multiplied by the number of 
Monte-Carlo simulations. We developed a C++ parallelized 
program using OpenMP. As an illustration of how crucial 
computer speed might be, on a shared- memory multipro- 
cessor system containing 8 Six-Core AMD Opteron(tm) 
running at 2.4 GHz, recovering the CMB map from one 
Monte- Carlo simulation takes about 9 minutes when 24 pro- 
cessors are used. 



5. Experiments 

5.1. Simulations and comparisons 

In early 2013, the finest - low noise, high resolution - CMB 
data will be Planck data. Therefore, in this section, we pro- 
pose evaluating the performances of the proposed approach 
on simulated Planck data. To the best of our knowledge, the 
only publicly available full-sky simulated Planck data are 
the dataset the used to evaluate preliminary performances 
of various CMB map estimation techniqu es in 2 008. T his 
pre-launch study has been published in Leach (2008). In 
the sequel numerical experiments will be performed using 
exactly the same Planck simulated dataset. Before going 
deeper into the description of the numerical results, the 
two following paragraphs clarify the astrophysical content 
of the simulation as well as the evaluation protocol we opted 
for. 



1. Compute the wavelet decomposition of the data. 

2. For each wavelet scale /x = 1, • • • , N/j^ : 

2.1 Put the C;x active channels to the same targeted 
resolution 

2.2 For each local k : 

2.2.1 - Compute a set of mixing matrices at different 
patch sizes (L^ levels of analysis) 

2.2.2 - Select the optimal mixing matrix 

2.2.3 - Compute the CMB map estimate at scale /i 
and location k. 

3. Reconstruct the CMB map following Equation |21| 



Some words about the simulations 

described 



The Planck-like dataset 



in [Leach ( 2008| ) has been simulated using an early 
version of the Planck Sky Model (PSM) developed by J. 
Delabrouille and collaborator^ In 2008, the PSM included 
most astrophysical signals and foregrounds as well as sim- 
ulated instrumental noise and beams. In details, the simu- 
lations were obtained as follows : 

— Frequency channels : the simulated data are 
comprised of the 9 LFI and HFI channels at fre- 
quency 33, 44, 70, 100, 143, 217, 353, 545, 857 GHz. 
The frequency-dependent beams are perfect isotropic 
Gaussian PSF with FWHM ranging from 5 arcmin at 
217,353,545,857 GHz to 33 arcmin at 33GHz. Real 
data differ from the model as the scanning strategy 
adopted by full-sky surveys such as WMAP or Planck 
is likely to produce non-isotropic and spatially varying 
beams which are more elongated along the scanning 
direction. 



Once the mixing matrices are estimated with L-GMCA 
(steps 2.2.1 and 2.2.2 of the algorithm), estimating the 
CMB map requires to perform both a wavelet-vaguelette 
decomposition and a weighting of the wavelet coefficient of 
the multifrequency data at each wavelet scale according to 
the pseudo- inverse values of the estimated local mixing ma- 
trix. This sequence of operations is particularly computer 
intensive for high resolution data {e.g. Npix 50 million 
pixels for Planck data for each of the 9 frequency maps), 
requiring for each resolution level and each channel two 



— Instrumental noise : in full-sky surveys, the sky 
coverage is generally not uniform : some areas are more 
often observed than others. To some extent, this entails 
that the instrumental noise variance is not homogenous 
as well. In statistical signal processing, such kind of 
statistical process is better known as heteroscedas- 
tic noise. A second effect of the scanning strategy 

^ For more details about the P SM, we invite the 
reader to visit the PSM website : | http:/ / www, ape. univ- 
^paris 7.fr/r^ delabrou/PSM/psm. html^ 
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is the correlation of the noise along the scanning 
direction. In the sequel, simulations account for the 
non-homogeneity of the noise but noise samples are as- 
sumed to be uncorrelated. Either for WMAP or Planck, 
the noise statistics {i.e. noise variance map for each 
frequency channel) are assumed to be known accurately. 

Cosmological Microwave Background : the CMB 

map is a correlated random Gaussian realization 
entirely characterized by its power spectrum. In the 
simulations, the CMB power spectrum is identical to 
that of WMAP. Higher order multipoles are based on 
a WMAP best-fit power spectrum at high £. Note that 
the simulated CMB is Gaussian; no non-gaussianity 
{e.g. lensing, ISW, fNL) has been added. This will allow 
for non-gaussianity tests under the null assumption in 
the sequel. 

Dust emissions : the galactic dust emissions is 
composed of two distinct dust emissions : thermal 
dust and spinning dust {a.k.a. anomalous microwave 
emissio n). Thermal dust is modeled via the Finkbeiner 
model dFinkbeiner et al.| [ 1999) which assumed two 



introduced in this paper. 

Since WMAP, ILC and its extensions ("Delabrouille et al.l 



hot /cold dust populations contribute to the signal in 
each pixel. The emission law of thermal dust varies 
across the sky. 

Synchrotron emission : The synchrotron emission, 
as simulated by the PS M, is an extrapolati on of 
the Haslam 408MHz map fHaslam etliI]|1982D. The 



emission law of the synchrotron emission is an exact 
power law with a spatially varying spectral index. 

Free-free emission : the spatial geometry of free-free 
emission is inspired by the Ha map built from the 
SHASSA and WHAM surveys. The emission law is a 
perfect power law with a fixed spectral index. 

Point sources : infrared and radio sources were 
added based on existing catalogues at that time. In the 
following, the brightest point sources will be masked 
prior to the results evaluation. 

Sunyaev Zel'Dovich effect : thermal SZ is modeled 
in the simulations. 



Small scales of the foreground maps were extrapolated 
to the Planck resolution. More details about the simulations 



can be found in Leach ( 2QQ8[ ). 



Comparison protocol : In 2008, most of the estimated 
CMB maps used for comparisons were quite heterogenous : 
they did not necessarily share the same beams. As a 
consequence, precise and quantitative evaluations were 
very hard to carry out. However, with the exception of 
GMCAn non e of the codes used to estimate CMB maps in 



#1 no] 

[[ poos 



Leach p008 ) is publicly available. 

The major objective of this paper is to evaluate the impact 
of the local and multi scale mixture model as well as 
the sparsity-based component separation technique we 



[Rem azeilles et al. 2011^ have taken the lion's share 
11- sky CMB data analysis; it is therefore very popular 



GMCA 



available 



at 



this 



address 



http:/ / jstarck.free.fr /is ap.html\ 



2009 
for fi 

in the astrophysics community. Like most component sepa- 
ration techniques in cosmology, ILC relies on second order 
statistics (more precisely, minimization) to estimate the 
CMB map. For this reason, we first chose to compare two 
different separation methods : GMCA (based on sparsity) 
and ILC (based on second order statistics). Furthermore, 
we believe that one of the contributions of this paper is the 
use of the local modeling in the wavelet domain; we will 
also evaluate the performances of methods along with a 
global mixture model (the corresponding methods will be 
ILC and GMCA) as well as with the local mixture model 
in wavelets (L-GMCA). 

The local and multiscale model requires the definition of 
4 parameters : i) the number of sources is set to be equal to 
the number of channels, ii) the number of wavelet scales, iii) 
the number of quad-tree decomposition levels and iv) 
the nominal patch size p^. All these parameters are given 
in Table 5.1 There is no automatic strategy to optimally 
select these parameters. We would like to notice that a 
non-exhaustive series of experiments; the particular choice 
of Table |5.1| turned out to be provide lower power spec- 
trum bias, especially at high NIL C (Needlet-ILC) is also 
perfor med Delabrouille et al. (2009); Basak & Delabrouille 
([ 20121 ). 

In the sequel, a 92% common mask, which preserves a very 
large part of the sky, has been defined as the union of a 
small galactic mask and point sources mask which has been 
derived from the brightest point sources. 

5.1.1. CMB map estimation 

In this paragraph, we mainly focus on the quality of estima- 
tion of the full- sky CMB map. Figure [3] display the input 
CMB at a resolution of 5 arcmin on top and the maps we 
obtained by performing ILC, GMCA, NILC and L-GMCA. 
More interestingly. Figure |4] show the residual maps we de- 
fined as the difference between the estimated maps and the 
input CMB map at 5 arcmin. These error maps have further 
been filtered at the resolution of 45 arcmin to filter out noise 
contamination which dominates at high frequency; this al- 
lows to better unveil the low frequency foreground residuals 
which remain in the final estimates. The ILC residual map 
(top-left picture) clearly exhibits large-scale features which 
are reminiscent of the synchrotron emission (negative bulb 
centered about the galactic center) as well as free-free emis- 
sions relics (well-known positive free-free on the right of the 
residual map) and dust emission. The NILC residual show 
significant large scale "blobby" effects which may be related 
to the use of needlet filters used for the analysis rather than 
more space-localized wavelet filters. Visually, GMCA, and 
L-GMCA seem to have similar remaining foreground con- 
tamination which is evocative of dust emission. 
Figure [5] shows the power spectra of the CMB maps. These 
spectra seem very similar up to £ = 500. At smaller scales, 
instrumental noise is the dominant signal. Dotted lines fea- 
ture the noise power spectra. Global methods, and espe- 
cially ILC, exhibit a large noise level at high ^; these meth- 
ods do not account for the beam variation across chan- 
nels. Therefore, ILC and GMCA are mainly sensitive to the 
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Band ji 


# common resolution 


# sources 


# quad-tree levels : 


Nominal patch size : 


1 


33 arcmin 


9 


3 


64 


2 


24 arcmin 


8 


3 


64 


3 


14 arcmin 


7 


3 


32 


4 


10 arcmin 


6 


3 


32 


5 


5 arcmin 


5 


3 


16 



Table 2. This table details the parameters used in local and multiscale sky model. The number of wavelet-vaguelette 
scales is = 5. 



largest scales and do not carefully deal with noise contam- 
ination at smaller scales. Space/scale localized approaches 
like L-GMCA and NILC have lower noise contamination 
with only slight differences between them. It is remarkable 
that, if NILC is expected to exhibit low noise contamina- 
tion {i.e. it relies on the second order statistics of the data), 
L-GMCA is also well designed to provide low noise contam- 
ination levels. 



5.1.2. Power spectrum estimation 

Measuring the power spectrum of the CMB map is of ut- 
most importance for cosmology; crucial information can be 
derived to constrain the cosmological parameters. Power 
spectrum estimation is generally a strenuous problem : 
a precise astrophysical analysis of the CMB power spec- 
trum has generally to be carried out to carefully evaluate 
its precise value and confidence intervals on these values. 
The common approach generally consists in estimated the 
CMB power spectrum from the frequency channels with 
very large masks to reduce the impact of galactic fore- 
grounds. The use of a single channel allows for more handy 
studies of contaminants impacts {e.g. point sources, cos- 
mological infrared background). However, the mandatory 
use of very large masks dramatically increases estimation 
errors at large-scales. The main advantage of source sep- 
aration is its ability to clean large areas of the sky which 
makes possible the use of smaller masks. The final being a 
non trivial combination of the frequency channels, studying 
error propagation is generally more complex. Fortunately, 
the linearity of the proposed methods makes possible the 
use of Monte- Carlo simulations to further study how fore- 
grounds contaminate the final CMB map. 
We carried out comparisons of the CMB power-spectrum 
which can be computed from the CMB maps estimated 
by these 4 methods. These experiments showed that the 
local/multiscale mixture model yields less bias but did 
not reveal significant differences between ILC-based and 
sparsity-based component separation techniques. These 
non-conclusive results can be explained by the low sensi- 
tivity of the power spectrum to slight foreground / non- 
ga ussian c ontamination. Interestingly, the results published 
in (|Leach|r 2QQ8 ) were quite comparable at all scales thus 
also leading to non-conclusive comparisons at the level of 
the CMB power-spectrum. 



5.2. Foreground contamination 

An accurate measure of foreground contamination is the 
cross power spectrum between the residual map and 
the maps of the input foregrounds. We evaluated the 



cross power-spectra of foreground templates used in the 
simulations with the error maps defined by the difference 
between the input CMB map {a.k.a. the true CMB map) 
and estimated CMB maps. These cross power-spectra have 
been performed with synchrotron emission in Figure |6j 
Free- free in Figure [7| dust emission in Figure [8] and SZ 
effect in Figure [9j The same study has been carried out 
with synchrotron but it did not show significant differences 
between the four methods. 



Cross-powspec residual/synchrotron emission 




Cross-powspec residual/synchrotron emission 




Fig. 6. Cross-power spectra of the estimated CMB map 
with the synchrotron template. 



The conclusion we can draw from this cross-correlation 
study are the following : 

1. Accurately accounting for the beams : The 

heterogeneity of the data {i.e. the beam varies across 
frequencies) is better accounted for. As explained in 
Section |3.1[ an inaccurate modeling for the beams 
increases the complexity of the mixtures; even if the 
emission law of a given component is spatially invariant. 



11 



J. Bobin, J.-L. Starck, F. Sureau and S. Basak: Sparse component separation for accurate CMB map estimation 



Input CMB in mK 





-0.50 ^^^^^^^^^^^^^^^^^^^H Q.50 



Fig. 3. Input (top) and estimated CMB maps in mK. 

the variability of the beam across frequencies make 
this emission law variant across multipoles in spherical 
harmonics. This is especially striking for the SZ com- 
ponent in Figure [9]: its emission law is exactly known 
and fixed in both GMCA and L-GMCA. However the 
two methods show radically different cross-correlation 
with the SZ template. Fixing the SZ emission does not 
guarantee that its contamination level vanishes if the 
beam is not properly taken into account. 
In these simulations, the electromagnetic spectrum 
does not vary across the sky; assuming the beam does 
not change across frequencies, its estimation should be 
made efficiently at all scales with methods based on 
global mixture models. However, as shown in Figure [7| 
GMCA and ILC exhibit a high level of Free-free 
contamination; especially for i > 100 in the case of 
ILC. Again, the ability of the proposed modeling to 
account for heterogeneous beams allows for a lower 
free-free contamination level. 




-0.50 ^^^^^^^^^^^^^^^^^^^H 0.50 



2. A more flexible modeling : whether NILC or 
L-GMCA, the local and multiscale mixture model we 
introduced in this paper allows for more degrees of 
freedom to better analyze complicatedly mixed com- 
ponents. In all the correlations computed so far, the 
techniques based on this modeling have outperformed 
the methods using the simple (but commonly used) 
global linear mixture model. This is particularly the 
case for synchrotron. Free-free and dust emissions in 
Figure [6j [7| and Figure [8] where methods based on 
the local/multiscale model exhibit lower foreground 
contamination. 



3. Sparsity versus second-order statistics : the large 
SZ contamination of ILC-based methods enlightens 
the low efficiency of second-order statistics to capture 
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Fig. 4. Residual CMB maps. These maps are defined as tfie difference between tfie estimated maps and tfie input CMB 
map. Units in mK. 



Power spectra 




500 1000 1500 2000 2500 3000 



Fig. 5. Power spectrum of tfie maps. The dotted hue is power spectrum of the input CMB map at the resolution of 5 
arcmin. Abscissa : spherical harmonics multipoles i. Ordinate : Power spectra value - units in £{£ + l)C^/(27r) mK^ 



non-gaussian foregrounds. Conversely, sparsity-based 
component separation techniques are much more 
effective at separating the CMB and non-gaussian 
contaminants. The sensitivity of sparsity to higher- 
order statistics is very likely at the origin of the lower 
synchrotron and dust contamination of L-GMCA for 
£ > 1000. 



5.3. Non gaussian contamination 

The CMB we used in these simulations is a perfect corre- 
lated Gaussian random field; precisely, it contains no trace 
of non-gaussianity whether it may be ISW, lensing or fATL- 



CMB non-gaussianities will evidently originate form spuri- 
ous foreground contaminations. In this paragraph, we pro- 
pose evaluating the level of (non) Gaussianity of the esti- 
mated CMB. The CMB being perfectly Gaussian in these 
simulations, such a study will give a different measure of 
the contamination level. 

Common non-gaussianity tests consist in computing the 
higher-order statistics of the estimated resid ual maps in 
the wavelet domain (Starck fc Murtagh| |2006 ). The aim of 
this study is to give a different way to measure foreground 
contamination. Therefore, we rather opted for the evalua- 
tion of the non-gaussianity (NG) level of the residual maps 
instead of the CMB. This has the crucial advantage of pro- 
viding a contamination level measure insensitive to CMB 
fluctuations. 
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Fig. 7. Cross-power spectra of the estimated CMB map 
with the free- free template. 

Cross-powspec residual/dust emission 




lO'L 



10 100 1000 

Cross-powspec residual/dust emission 




10' !^ ■ ... I ■ ... I ■ 

500 1000 1500 2000 



Fig. 8. Cross-power spectra of the estimated CMB map 
with the thermal dust template. 



For this purpose, several statistical tests were performed : 
the kurtosis {i.e. statistics of order 4) is calculated in the 5 



component separation for accurate CMB map estimation 
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Fig. 9. Cross-power spectra of the estimated CMB map 
with the SZ template. 

first wavelet scales of the residual maps. These results are 
shown in the top pictures of Figure [To) Except at very large 
scale {£ < 200), these two measures of non-gaussianities 
draw very similar conclusion : i) in the first wavelet scale, 
all the three methods are likely to be compatible with a 
Gaussian contamination; this largely originates from the 
preeminence of noise in this range, ii) the sparsity -based 
separation criterion used in L-GMCA yields lower NO lev- 
els in the scales 2 to 4 (i.e. 200 <i< 1600). At large scale, 
L-GMCA has a lower kurtosis value. At that stage, mas- 
sive simulations of all the components (and not only noise) 
would be mandatory to precisely evaluate the performances 
of these methods at low £. 

The bottom graphics of Figure [To] features the value of the 
kurtosis in fixed wavelet scales but in different equi-angular 
bands of latitude. By convention, the data were in galac- 
tic coordinates; latitude thus corresponds to the galactic 
plane. From these pictures, L-GMCA is likely to be less 
contaminated than the other three residual maps. More sig- 
nificantly, GMCA exhibits lower NG contamination at all 
latitudes in the scales 2 to 4. This enlightens the role of 
GMCA and more precisely the use of sparsity to better ex- 
tract non-gaussian sources. It is very likely that the second 
order statistics used as a separation criterion in ILC is less 
sensitive that sparsity to separate foregrounds. This un- 
veils the positive impact of the local and multiscale model 
together with a sparsity-based separation criterion as used 
in L-GMCA. 

6. Discussion - incorporating astrophysical models 

Except for CMB and SZ, no precise astrophysical model 
is currently accounted for within GMCA. Remarkably, in- 
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Kurtosis per wavelet scale 
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Kurtosis - 1st wavelet scale 



Kurtosis - 2nd wavelet scale 
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Fig. 10. Kurtosis of the estimated CMB maps in the wavelet domain. Top : per wavelet scale. Per latitude - top-left 
panel : first wavelet scale, top-right panel : second wavelet scale, bottom-left panel : third wavelet scale, bottom-right 
panel : fourth wavelet scale. Dashed lines are 3a confidence interval computed from 25 random noise realizations. 



corporating more precise modeling of some of the compo- 
nents would mainly consist in adding up ad ditive constraint 
to the problem in Equation 13 In Leach (2008), a model- 
based extension of GMCA was also partly evaluated. More 
precisely the "model GMCA" modeled the emission law of 
the synchrotron and free-free emission as power laws with 
fixed but unknown spectral index. Interestingly, the CMB 
map estimate we obtained at that time seemed particularly 
less contaminated by these low frequency foregrounds at 
the expense of a significantly large noise level. However, as 
shown in this paper, the first attempt to model for more 
sophisticated astrophysical prior models were kind of use- 
less efforts as long as the variation of the beam across scale 
was not properly accounted for. The model introduced in 
this paper should allow for a more effective implementa- 
tion of these models : i) the beam issue is now properly 
tackled, ii) the local/multiscale model offers a privileged 
framework to model for the natural multiscale variation of 



models' parameters such as the spectral index of the syn- 
chrotron emission and the temperature/spectral index of 
the thermal dust emission. Beyond models, the use of an- 
cillary data {e.g. Haslam map for the synchrotron emission, 
IRAS map for the thermal dust emisson. Ha map for the 
free-free emission) within GMCA should be possible. It is 
not clear yet how to proceed with a extension of L-GMCA 
to a model-based component separation techniques. Two 
approaches could be envisioned : i) implementing astro- 
physical modeling to jointly estimate the components' maps 
as well as model parameters at the expense of an increase of 
the numerical complexity of the estimation problem, ii) al- 
ternating between L-GMCA with fixed parameters for the 
foregrounds and a template fitting/parameter estimation 
for parameter estimation. This is out of the scope of this 
study and is reserved for future work. 
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Software 

Source separation techniques used in cosmology and more 
precisely for CMB map estimation are generally high-end 
method which are seldom publicly shared and available on- 
line. T he L-GMCA code will be made publicly available at 
http://www.cosmostat.org/lgmca.html 

7. Conclusion 

The estimation of a high precision CMB map featuring 
low noise and low foreground contamination is of crucial 
interest for the astrophysical community. This problem is 
customarily tackled in the framework of component sepa- 
ration. As any estimation problem, the accurate modeling 
of the data is essential. However, a close look at the as- 
trophysical phenonema at play in the CMB data such as 
WMAP or Planck reveals that the linear mixture model 
used so far by common component separation methods 
in cosmology does not hold. First, the variation of the 
beam across scale is seldom accounted for which highly lim- 
its the performances of these component separation meth- 
ods especially at small scales. More importantly, the linear 
mixture model does not afford enough degrees of freedom 
to precisely capture the complexity of the data including 
the variability across space of the emission law of some 
of the components of interest. To alleviate these limita- 
tions, this paper introduces a new modeling of the com- 
ponents' mixtures using a multiscale and local decomposi- 
tion of the data in the wavelet domain. We introduced a 
novel sparsity-based coined L-GMCA (Local - Generalized 
Morphological Component Analysis) which makes profit of 
the proposed local/multiscale mixture model. In the pro- 
posed framework, wavelet-based multiscale analysis allows 
for a precise modeling of the beam evolution across chan- 
nels. Capturing the variations across pixels of the emissivity 
of components is carried out by partitioning each wavelet 
scale with adaptive patch sizes. Extensive numerical ex- 
periments have been carried out which show the superior- 
ity of the proposed modeling and separation technique to 
provide a clean, low-foreground CMB map estimate. More 
precisely, we showed that the local and multiscale model- 
ing allows for improved separation results whether it is used 
with separation techniques as different as ILC and GMCA. 
Additionally, the numerical experiments enlightens the dra- 
matic positive impact of the use of sparsity in L-GMCA to 
provide less galactic foreground contamination as well as 
significantly lower non-gaussianity levels. 
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Appendix A: Multiscale local mixture model 

Before going further into the description of the method, 
principles of the multichannel quadtree decomposition and 
notations have to be discussed. In multiscale image analy- 
sis, quadtree decomposition amounts to decomposing each 
wavelet scale in patches with dyadic sizes starting from the 
original field itself and sequentially subdivide each patch 
in 4. To our knowledge, this multiscale analysis procedure 



has never been extended to analyze multichannel data in a 
source separation framework. In this specific context, such 
an extension consists in estimating the mixture parameters 
{i.e. the mixing matrix A and the sources S) at each scale 
/i on patches with various patch sizes starting from large 
patches of size 2^^p^ to smaller patches of size p^. The pa- 
rameter Ljj^ denotes the number of decomposition levels or 
subdivisions in which the estimation of the mixture param- 
eters is carried out. 

Figure [2] clearly illustrates the principle of the decomposi- 
tion : the analysis {i.e. estimation of the mixing matrices) 
is first performed on the largest patch size allowed 2^^'p^ : 

W^^^[/c]. In the next step, the patch W^^^[/c] is divided 

into 4 non-overlapping patches of size 2^^^~^p^ on which 
the parameters estimation is carried out. The same process 
is performed until the final patch size p^ is attained. 
The notation we will be using in the sequel is the following : 

— Multichannel patch w[^^[/c] : Is the concatenation 
of frequency patches of size 2^p^ x 2^p^ as defined 
in Figure [2] centered on pixel k. We invite the reader to 
notice that the three indices of the notations ^l^\k] 
underlines the dependence of this variable on the : i) 
the pixel at which the analysis is carried out, ii) the 
wavelet scale /i and quadtree decomposition level 

— Multichannel patch A\^\k] : Mixing matrix esti- 
mated from the multichannel patch wj^^ [k] . 

Interestingly, the same area of the sky being analyzed 
at different mixture scales {i.e. patch sizes), it makes it 
possible to choose afterwards the "optimal" patch size 
from the different estimates obtained for / = 0, • • • , L^. It 
therefore allows for more degrees of freedom to select an 
adapted patch size at each location. 

Full-sky CMB data (e.g. WMAP, P lanck) are generally 
sampled on the Healpix sampling grid (Gorski et a l.||2QQ5 ). 
For the sake of simplicity and computational efficiency, the 
most straightforward way to decompose an Healpix map 
into patches is to decompose each Healpix face. 

Choosing the "optimal" patch decomposition 

For each patch location /c, L-GMCA provides a sequence of 
parameters or more specifically mixing matrices computed 

at different mixtures scales : | a1^^ \k] \ . These mix- 

l ^ J l=0,--,L^ 

ing matrices have been computed from patches of increas- 
ing sizes; this entails that they offer different views of the 
data in the neighborhood of pixel k. As emphasized in the 
previous section, this allows for extra flexibility to choose 
adapted local parameters : i) mixing matrices estimated 
from larger areas will be more adapted to stationary ar- 
eas, ii) mixing matrices evaluated from local patches anal- 
ysis are well-suited to better capture local variations of the 
foreground emissivity. The difficulty now mainly consists in 
defining a strategy to select the optimal local estimator of 
the mixing matrix {i.e. optimal amongst the set of available 
estimators). 

First, it has to be noticed that in the framework of GMCA 
and similarly in L-GMCA, once the mixing matrices is esti- 
mated, the final estimate of the components are computed 
by applying the Moore pseudo-inverse of the mixing matrix 
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to the data : 



This way the sources linearly depend on the original data 
W^; this is particularly important to control error propa- 
gation in the final CMB map estimate. 
Discriminating between the available mixing matrices at lo- 
cation k amounts to defining a criterion sensitive the a lo- 
cal mis-estimation of the CMB map. From the sequences of 

available mixing matrices {aJ^^ [/cjj/^o,--- one can com- 
pute a sequence of various local CMB map estimates about 
pixel k at the level of the smallest patch of size . These es- 
timates are computed by applying the pseudo- inverse of the 

different matrices {a[^^[A:]}/^o,- - to the smallest data 

patch about pixel /c, W^^^q[A:] : 



Quite naturally, the optimal mixing matrix amongst the 



available estimators 



should be chosen 



such that the estimated CMB is the least contaminated 
possible. Here contamination is taken in the wide sense: 
foreground and noise. One remarkable property of the CMB 
is its decorrelation with the instrumental noise and all other 
astrophysical components. More quantitatively, owing the 
linearity of the estimation process, the estimated CMB can 
be modeled as : 

where s^^^ stands for the estimated CMB in scale /i, s^^"^^ 
the sought after clean CMB, the residual coming from 
the foregrounds contaminants and n^^^ the instrumental 
noise. As the CMB is decorrelated with both the residual 
and noise, the variances of the three terms in the above 
equation add up : 

Var {s^^^ } = Var {s^^^"^ } + Var {r^^^ } + Var {n^^^ } 

As a consequence, a solution with higher noise or fore- 
ground contamination will have a higher variance. It is then 
natural to choose the mixing matrix amongst the sequence 

|a[^^[A:]| which yields the local CMB estimate 

with the lowest variance. This suggests that the optimal 
mixing matrix and local CMB map should be chosen as 
follows : 



[k] = Argmin^^o 
where s[^q[A:] = 



[kfwl^lik]]^ 



(Al) 
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where the operator [ . ] extracts the CMB out of the entire 
set of estimated components. 

Other estimator selection criteria can be envisioned based 
in statistical characteristics of the CMB map such as its 
gaussianity. Foreground contamination are very likely to in- 
crease the non-gaussianity level of the estimated CMB map. 
Non-gaussianity measures such as higher-order cumulants 
could be used to discriminate between various local esti- 
mates of the CMB. However, these measures would be less 
sensitive to the noise contamination. 
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